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Abstract — Compressed sensing aims at reconstructing sparse 
signals from significantly reduced number of samples, and a 
popular reconstruction approach is ^i-norm minimization. In 
this correspondence, a method called orthonormal expansion is 
presented to reformulate the basis pursuit problem for noiseless 
compressed sensing. Two algorithms are proposed based on 
convex optimization: one exactly solves the problem and the other 
is a relaxed version of the first one. The latter can be considered 
as a modified iterative soft thresholding algorithm and is easy 
to implement. Numerical simulation shows that, in dealing with 
noise-free measurements of sparse signals, the relaxed version 
is accurate, fast and competitive to the recent state-of-the-art 
algorithms. Its practical application is demonstrated in a more 
general case where signals of interest are approximately sparse 
and measurements are contaminated with noise. 

Index Terms — Augmented Lagrange multiplier. Compressed 
sensing, £i minimization, Orthonormal expansion. Phase tran- 
sition, Sparsity-undersampling tradeoff. 



I. Introduction 

COMPRESSED sensing (CS) aims at reconstructing a 
signal from its significantly reduced measurements with 
apriori knowledge that the signal is (approximately) sparse 
|[Tj-||3j. In CS, the signal x° E of interest is acquired by 
taking measurements of the form 

b=Ax° +e, 

where A G M"^^ is the sampling matrix, b E M" is the vector 
of our measurements or samples, e E M" is the measurement 
noise, with N and n being sizes of the signal and acquired 
samples, respectively. A standard approach to reconstructing 
the original signal x° is to solve 

(BPg) min|jjr||j, subject to — Ajr|l2 < e, 

which is known as the basis pursuit denoising (BPDN) prob- 
lem, with ||c||2 < e. Another frequently discussed approach is 
to solve the problem in its Lagrangian form 

(QPJ imnl^X\\x\\, + ^\\b-Ax\\lY 
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It follows from the knowledge of convex optimization that 
(BPf) and (QPa) are equivalent with appropriate choices of e 
and A. In general, A decreases as e decreases. In the limiting 
case of A, e — > 0, both (BPj) and (QP^) converges to the 
following basis pursuit (BP) problem in noiseless CS: 



(BP) 



mm \ x\ 



1 ' 



subject to Ax = b. 



It is important to develop accurate and computationally 
efficient algorithms to deal with the £i minimization problems 
of high dimensional signals in CS, such as an image of 
512 X 512 pixels. One popular approach for solving (QPa) 
is the iterative soft thresholding (1ST) algorithm of the form 
(stating from atq = 0) ||4), ||5) 



xt+i Sx (xt +A'zt) 



Zt 



b -Axt, 



(1) 



where ' denotes the transpose operator and S\{-) is the soft 
thresholding operator with a threshold A, which will be defined 



more precisely in Section II-A 1ST has a concise form and 
is easy to implement, but its convergence can be very slow 
in some cases Q, especially for small A. Other algorithms 
proposed to solve (QP^) include interior-point method |6J, 
conjugate gradient method |7| and fixed-point continuation 
ijS]. Few algorithms can accurately solve large-scale BPDN 
problem (BP^) with a low computational complexity. The £i- 
magic package |9| includes a primal log barrier code solving 
(BPe), in which the conjugate gradient method may not find 



a precise Newton step in the large-scale mode. NESTA |10| 
approximately solves (BP^) based on Nesterov's smoothing 



technique 1 1 1 1, with continuation. 

In the case of strictly sparse signals and noise-free mea- 
surements, many fast algorithms have been proposed to exactly 
reconstruct One class of algorithms uses the greedy pursuit 
method, which iteratively refines the support and entries of 
a sparse solution to yield a better approximation of x°, 
such as OMP (Til, StOMP (T^ and CoSaMP |14|. These 
algorithms, however, may not produce satisfactory sparsity- 
undersampling tradeoff compared with (BP) because of their 
greedy operations. As mentioned before, (QPa) is equivalent 
to (BP) as A — ?► 0. Hence, (BP) can be solved with high 
accuracy using algorithms for (QPa) by setting A to a small 
value, e.g. 1 x 10~^. 1ST has attracted much attention because 
of its simple form. In the case where A is small, however, 
the standard 1ST in ([T]i can be very slow. To improve its 
speed, a fixed-point continuation (FPC) strategy is exploited 
[8], in which A is decreased in a continuation scheme and a 
q-linear convergence rate is achieved. Further, FPC- AS |15| is 
developed to improve the performance of FPC by introducing 



2 



an active set, inspired by greedy pursuit algorithms. An 
alternative approach to improving the speed of 1ST is to use 
an aggressive continuation, which takes the form 



xt+i = Sx^ {xt +A'zf) 



Zt 



b — Axt 



(2) 



where At may decrease in each iteration. The algorithm of this 
form typically has a worse sparsity-undersampling tradeoff 
than (BP) | ,16J . Such a disadvantage is partially overcome 
by approximately message passing (AMP) algorithm 1 17 1, in 
which a modification is introduced in the current residual Zt'- 



xt+i = Sx^ {xt +A'zt) , Zt^b Axt 



N\\xt\\o 



Zt-i, (3) 



where \\x\\q counts the number of nonzero entries of j:. It 
is noted that AMP having the same spasity-undersampling 
tradeoff as (BP) is only established based on heuristic argu- 
ments and numerical simulations. Moreover, it cannot be easily 
extended to deal with more general complex-valued sparse 
signals, though real-valued signals are only considered in this 
correspondence. 

This correspondence focuses on solving the basis pursuit 
problem (BP) in noiseless CS. We assume that AA' is an 
identity matrix, i.e., the rows of A are orthonormal. This is 
reasonable since most fast transforms in CS are of this form, 
such as discrete cosine transform (DCT), discrete Fourier 
transform (DFT) and some wavelet transforms, e.g. Haar 
wavelet transform. Such an assumption has also been used 
in other algorithms, e.g. NESTA. A novel method called or- 
thonormal expansion is introduced to reformulate (BP) based 
on this assumption. The exact OrthoNormal Expansion li 
minimization (eONE-Ll) algorithm is then proposed to ex- 
actly solve (BP) based on the augmented Lagrange multiplier 
(ALM) method, which is a convex optimization method. 

The relaxed ONE-Ll (rONE-Ll) algorithm is further de- 
veloped to simplify eONE-Ll. It is shown that rONE-Ll 
converges at least exponentially and is in the form of modified 
1ST in Q. In the case of strictly sparse signals and noise- 
free measurements, numerical simulations show thatrONE-Ll 
has the same sparsity-undersampling tradeoff as (BP) does. 
Under the same conditions, rONE-Ll is compared with state- 
of-the-art algorithms, including FPC-AS, AMP and NESTA. 
It is shown that lONE-Ll is faster than AMP and NESTA 
when the number of measurements is just enough to exactly 
reconstruct the original sparse signal using li minimization. 
In a general case of approximately sparse signals and noise- 
contaminated measurements, where AMP is omitted for its 
poor performance, an example of 2D image reconstruction 
from its partial-DCT measurements demonstrates that rONE- 
Ll outperforms NESTA and FPC-AS in terms of computa- 
tional efficiency and reconstruction quality, respectively. 

The rest of the correspondence is organized as follows. 
Section [n] introduces the proposed exact and relaxed ONE-Ll 
algorithms followed by their implementation details. Section 
III reports the efficiency of our algorithm through numerical 
simulations in both noise-free and noise-contaminated cases. 
Conclusions are drawn in Section HVl 



II. ONE-Ll Algorithms 

A. Preliminary: Soft Thresholding Operator 

For w € M, the soft thresholding of w with a threshold 
A G is defined as: 

Sx{w)^^gn{w)-{]w\-\) + , 

where (•)+ = max(-,0) and 



sgn [w) 



wj |w| , w ^ 0; 



0, 



u; = 0. 



The operator 5a(-) can be extended to vector variables by its 
element-wise operation. 

The soft thresholding operator can be applied to solve the 
following i'l-norm regularized least square problem |4|, i.e.. 



Sx{w) = argmin \ A \\v\\^ + ^ Ik - vf^ 



where VjH" G M", and S'a(h') is the unique minimizer. 



(4) 



B. Problem Description 

Consider the -minimization problem (BP) with the sam- 
pling matrix A satisfying that AA! = I, where / is an 
identity matrix. We call that A is a partially orthonormal 
matrix hereafter as its rows are usually randomly selected 
from an orthonormal matrix in practice, e.g. partial-DCT ma- 
trix. Hence, there exists another partially orthonormal matrix 
B e lij(^-")x^^ whose rows are orthogonal to those of A, 

"aI 

is orthonormal. Let p = ^x. The problem 



such that $ = 



B 



(BP) is then equivalent to 



(BP°) 



mm \\x\ 

x,p,T(j,)=b 



subject to $jr = /?, 



where V(p) is an operator projecting the vector/; onto its first 
n entries. 

In (BP°), the sampling matrix A is expanded into an 
orthonormal matrix $. It corresponds to the scenario where 
the full sampling is carried out with the sampling matrix $ 
and p is the vector containing all measurements. Note that only 
b, as a part of p, is actually observed. To expand the sampling 
matrix A into an orthonormal matrix $ is a key step to show 
that the ALM method exacdy solves (BP°) and, hence, (BP). 
The next subsection describes the proposed algorithm, referred 
to as orthonormal expansion f i-minimization. 

C. ALM Based ONE-Ll Algorithms 

In this subsection we solve the -minimization problem 
(BP°) using the ALM method flS), (Tp). The ALM method 
is similar to the quadratic penalty method except an addi- 
tional Lagrange multiplier term. Compared with the quadratic 
penalty method, ALM method has some salient properties, e.g. 
the ease of parameter tuning and the convergence speed. The 
augmented Lagrangian function is 



C{x,p,y,n) = + lp- '^x,y) 



<Px\\ 



(5) 
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where Lagrange multiplier 3? G M , /i e M+ and (m,v) = 
m'v e M is the inner product of m, v G M^. Eq. (jsjl can be 
expressed as follows: 



C{x,pj,fi) = \\x\\^ + ^\\p-^x + n 



^ "y\\l (6) 



2pi 



Subsequently, we have the following optimization problem 

iSP): 

(SP) min C{x,p,y,y). 

'c,pV(p)=b 

Instead of solving {SP), let us consider the two related 
problems 

{SPi) min C{x,p,y,fi), 



and 



{SP2) min C{x,p,y,^i). 

pS(p)=b 



Note that problem (SPi) is similar to the £1 -regularized 
problem in (|4]i. In general, {SPi) cannot be directly solved 
using the soft thresholding operator as that in Q since there 
is a matrix product of $ and x in the term of £2-norm. 
However, the soft thresholding operator does apply to the 
special case where $ is orthonormal. Given II'&mH, = ||m||2 
for any u e M^, we can apply the soft thresholding to obtain 

Sf,-^ {p + ^J-^^y)) = argrnin C{x,p,y,^l). (7) 

To solve {SP2), we let c)p^^£(ji:,/?,j, /i) = to obtain T{p) — 
r (^x - H^^y), i.e.. 



r - fi-^y) 



= arg min £{x,p,y,n), (8) 

p,r{p)=b 



where r(.) is the operator projecting the variable to its last 
N — n entries. As a result, an iterative solution of (SP) is 
stated in the following Lemma [T] 

Lemma 1: For fixed y and ^, the iterative algorithm given 

by 



($' {p^ - 
b 

r ( ^x^+'^ - 



(9) 
(10) 



converges to an optimal solution of (SP). 

Proof: Denote £ {x,p,y, fi) as £{x,p), for simplicity. 
By the optimality and uniqueness of x^^^ and /7^+^, we 
< £ (x^ ,p^) and the equality holds if 
(x^,p^). Hence, the sequence 
{£ (jc^,/;-')} is bounded and converges to a constant L*, i.e., 
L* as j +00. Since the sequence {x^} is 



have £ 
and only if 



< £ 



xt as i 



there exists 



also bounded by ||jc-' 

a sub-sequence {x^^}^^ such that jc-' 
where jc* is an accumulation point of {x^ } . Correspondingly, 
P^' ~^ P*s^ ™d moreover, £ (jr*,/;*) = L*. 

We now use contradiction to show that {x'*,p'*) is a 
fixed point of the algorithm. Suppose that there exist 
{Xs,Ps) 7^ (KiPI) such that = argmin^ £ (a:,/?*) and 
p^ =_aTgmm^(j,)^i£{x*,p), and £{Xs,Ps) < L*. By 

-JcJL < 



pJ'+i ^ p^. Hence, £ {x^'+\p^^+^) £{Xs,Ps) < L*, 
which contradicts £ {x^ ,p^) — > L*, resulting in that (jc*,/?*) 
is a fixed point. Moreover, it follows from ||jc-''+^ — Jc!lL < 
\\x^' — — ^ for any positive integer q, that x^ ^■ 

j — > +00. 

A 



s\\2 

xl, as 



Note that orthonormal matrix $ = 
B'B = /. We can obtain 



B 



and $'$ =A'A 



= (A::+A'(i> + Ai"ir(y)-A^:)). (11) 
Meanwhile, (SP) is equivalent to 



min£ x 



b 

r (<I>a: - /_r 



Ajc 



(12) 



2fi 



Ml 



By PI Proposition 3.10], x'* is an optimal solution of the 



12 1 and equivalently, (jc*,/?*) is an optimal 



problem in 

solution of (SP). m 
Remark 1: 

(1) Lemma 1 shows that to solve problem {SP) is equivalent 
to solve, iteratively, problems [SPi) and (S'-Pj). 

(2) Reference |j4j Proposition 3.10] only deals with the spe- 
cial case IIAII2 < 1 and it is, in fact, straightforward to 
extend the result to arbitrary A. 

Following from the framework of the ALM method fT9| 
and Lemma [T] the ALM based ONE-Ll algorithm is outlined 
in Algorithm 1, where [x^^pD is the optimal solution to 
[SP) in the tth iteration wdy^ is the corresponding Lagrange 
multiplier. 

Algorithm 1: Exact ONE-Ll Algorithm via ALM Method 

Input: Expanded orthonormal matrix <& and observed sample data b. 

* ; yl = 0- ,10 >0;t: 



l.xl 







0-pl 



■■ 0. 



2. while not converged do 

3. Lines 4-9 solve J = arg minf^^p^p^p)^;,) C (x,p,y* , ^it); 



4. 
5. 
6. 

7. 



=4,;j?_|_i =pt, j = 0; 
while not converged do 



3+1 
Pt+i 



: S 



i+i + Mt 



b 



j+i 
t+i 



8. set j=j + 1; 

9. end while 

10. y't+i =yl + tit (Pt'+i - 

11. choose fJ.t+1 > /J-t', 

12. set t = t + 1; 

13. end while 
Output: (x*,pI). 



The convergence of Algorithm 1 is stated in the following 
theorem. 

Theorem 1: Any accumulation point {x* ,p*) of sequence 
{(jfj of Algorithm 1 is an optimal solution of (BP°) 

and the convergence rate with respect to the outer iteration 



\\x^' 



\Q\ and 14| Lemma 2.2], we have 
i.e., x^'^^ jf,, as i ^ 



loop index t is at least O [fJ-t^i) in the sense that 
\\\xt\\,-x^=0{t,-\), 
00. Meanwhile, where jc^ = 



4 



Proof: We first show that the sequence {y^ } is bounded. 

we have 



By the optimality of (jCj^^^/'t+i) 
e d^C J*, Ait) = 

= ^(p)^feii/'t+ii3't,Mt 



r(j:+i) 



$'3, 



t+i' 



where denotes the partial differential operator with respect 
to X resulting in a set of subgradients. Hence, ^'jj+i G 



d 



It follows that \Wy 



bounded. By > C {x;^^,plj^^,y% ^Xt), 

1 



< 1 and {y^} is 



<xT - 



1 

By {y^} is bounded. 



<JCT 



(13) 



For any accumulation point jc* of jCj , without loss of generality, 

< jc^. In the 



we have jc^ — > jc* as t ^ +00. Hence, _ 
mean time, PiVi = ^x^+i+^i^^ {y*t+i - y*t) P* and<i>x* 
p* result in that {x*,p*) is an optimal solution to (BP°). 
Moreover, by = $' [/7*+i - fi^^ {y*t+i - y*t)] and 

— in in I 



= mm 
p,r(p)=A 



l^'plli < Wp 



t+i\\i ■ 



o (Mr 



which establishes the 



*i=p,r(p)=i 
we have > x"^ 

theorem with ( [T3| . ■ 
Algorithm 1 contains, respectively, an inner and an outer 
iteration loops. Theorem [T] presents only the convergence rate 
of the outer loop. A natural way to speed up Algorithm 1 is 
to terminate the inner loop without convergence and use the 
obtained inner-loop solution as the initialization for the next 
iteration. This is similar to a continuation strategy and can be 
realized with reasonably set precision and step size nt p9) . 
When the continuation parameter /it increases very slowly, 
in a few iterations, the inner loop can produce a solution 
with high accuracy. In particular, for the purpose of fast and 
simple computing, we may update the variables in the inner 
loop only once before stepping into the outer loop operation. 
This results in a relaxed version of exact ONE-Ll algorithm 
(eONE-Ll), namely relaxed ONE-Ll algorithm (rONE-Ll) 
outlined in Algorithm 2. 

Algorithm 2: Relaxed ONE-Ll Algorithm 

Input: Expanded orthonoimal matrix <I> and observed sample data b. 
* ; yo = 0; fio > 0; t 



1. XQ = 0;pq 







: 0. 



-Mt 



2. while not converged do 
b 

5- yt+\=yt + ^^t(pt+^-^xt+l) 

6. choose fit+i > fJ-t', 

7. set t = t + 1; 

8. end while 

Output: {xt,Pt)- 



Theorem 2: The iterative solution {xt,Pt) of Algorithm 
2 converges to a feasible solution {x^ ,pf) of (BP°) if 
J2t^ ih^ < +00. It converges at least exponentially to 
{x^ ,pf) if {/if} is an exponentially increasing sequence. 



Proof: We show first that sequences {yf.} and {y^} are 
bounded, where _yj = yf_i + ^t-i {pt-i ^^■'^t)- By the 
optimality of Xt+i and Pf^i we have 

e ^^C{xt+l,Pt,yt,^^t) = d\\xt+i\\i- ^'yt+i, 
= 9T(J,)^{xt+l,Pt+l,yu^^t) ^T{yt+i)- 



Hence, ||*'.yt+i|L 
Since y^^^ = y^^^ 



< 1 and it follows that {y^} is bounded. 
+ Mt (pt+i ^Pt)^ we obtain T {y^^^) ^ 



r(jf+i)- This together with F (jj^j^j = results in 
||jt+i||2 < Ilit+iL ™d *e boundedness of {jj. By p^^^ - 
Pt = Mf^M-^t+i -^t+i)' we have \\^^^^~p^\\^ < Cfj.^^ 



I have \\p^^^ ~Pt\\2 - ^t^t 
with C being a constant. Then {p^} is a Cauchy sequence 
if X^tt^ < +00, resulting in p^ ^ p^ as t ^ -\-oo. In the 
mean time, Xt x^, = p^ . Hence, (x^ ,p-^) is a feasible 
solution of (BP°). Suppose that {/it} is an exponentially 
increasing sequence, i.e., fit+i — ?'Mt with r > 1. By the 
boundedness of {y^} and {y^} we have 



-\-oo 



J2 (P'^ -p^+^) 



+ CX3 

i=t 



<C/i, 



+00 

-1 



Hence, {p^} converges at least exponentially to p^ since 
{/i^^} exponentially converges to 0, and the same result holds 
for {xt}. ■ 
Remark 2: It is shown in Theorem 2 that faster growth 
of {jit] can result in faster convergence of {xt}- Intuitively, 
the reduced number of iterations for the inner loop problem 
[SP] may result in some error from the optimal solution 
of the inner loop. This will likely affect the accuracy of the 
final solution for (BP). Therefore, the growth speed of 
{Ht} provides a tradeoff between the convergence speed of the 
algorithm and the precision of the final solution, which will 



be illustrated in Section III through numerical simulations. 



D. Relationship Between rONE-Ll and 1ST 

The studies and applications of 1ST type algorithms have 
been very active in recent years because of their concise pre- 
sentations. This subsection considers the relationship between 
rONE-Ll and 1ST. Note that T (jt) =0 in Algorithm 2 and 
$'$ = A'A+B'B — I. After some derivations, it can be shown 
that the rONE-Ll algorithm is equivalent to the following 
iteration (starting from Xt — 0, as t < 0, and Zt — 0, as 
t < 0): 

xt+i = Sxt {xt +A'zt) , 



Zt=b-A[{l + Kt)xt ~ KtXt-i] + ntZt-i, 



(14) 



where A( — /i^ ^ and Kf ~ 



Compared with the general 



form of 1ST in (j2]), one more term KfZt_i is added when 
computing the current residual Zt in rONE-Ll. Moreover, a 
weighted sum {l + Kt)xt — KtXt^i is used instead of the current 
solution Xt- It will be shown later that these two changes 
essentially improve the sparsity-undersampling tradeoff. 



Remark 3: Equations in (14 1 show that the expansion from 
the partially orthonormal matrix A to orthonormal $ is not at 
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all involved in the actual implementation and computation of 
rONE-Ll. The same claim also holds for eONE-Ll algorithm. 
Nevertheless, the orthonormal expansion is a key instrumen- 
tation in the derivation and analysis of Algorithms 1 and 2. 

E. Implementation Details 

As noted in Remark [3] the expansion from A to $ is 
not involved in the computing of ONE-Ll algorithms. In our 
implementations, we consider using exponentially increasing 
/it, i.e., fixing r > 1 and jit — r^fJ-o- Let Qa{-) be an a- 
quantile operator and /ip ~ 1/Qq (I^'^I)' ^^^^ I' I ^PPlying 
to the vector variable elementwise, /Iq ^ being the threshold 
in the first iteration and a — 0.99. In eONE-Ll, a large 
r can speed up the convergence of the outer loop iteration 
according to Theorem [T] However, simulations show that a 
larger r can result in more iterations in the inner loop. We use 
r = l+ n/A^as default. In rONE-Ll, the value of r provides 
a tradeoff between the convergence speed of the algorithm 
and the precision of the final solution. Our recommendation 
of r to achieve the optimal sparsity-undersampling tradeoff 
is r = min (1 + 0.04n/iV, 1.02), which will be illustrated in 
Section HiFaI 

An iterative algorithm needs a termination criterion. The 
eONE-Ll algorithm is considered converg ed if < 

1^112 

with Ti being a user-defined tolerance. The inner iteration is 

considered converged if \, j., ^ ^ < T2. In our implemen- 

ll-^t II2 

tation, the defauh values are (ri,r2) = (10^^,10"''). The 
rONE-Ll algorithm is considered converged if ^^^y^y^^^^ < r, 
with T = 10^'' as default. 

III. Numerical Simulations 
A. Sparsity-Undersampling Tradeoff 

This subsection considers the sparsity-undersampling trade- 
off of rONE-Ll in the case of strictly sparse signals and 
noise-free measurements. Phase transition is a measure of the 
sparsity-undersampling tradeoff in this case. Let the sampling 
ratio he S = n/N and the sparsity ratio be p = k/n, where k 
is a measure of sparsity of x, and we call that x is fc-sparse 
if at most k entries of x are nonzero. As k,n, N 00 with 
fixed 6 and p, the behavior of the phase transition of (BP) 
is controlled by {S,p) pO) , pTj . We denote this theoretical 
curve by p — pt {S), which is plotted in Fig [T] 

We estimate the phase transition of rONE-Ll using a Monte 
Carlo method as in |17|, ||22|. Two matrix ensembles are con- 
sidered, including Gaussian with N ~ 1000 and partial-DCT 
with N = 1024. Here the finite-A^ phase transition is defined 
as the value of p at which the success probability to recover the 
original signal is 50%. We consider 33 equispaced values of 
S in {0.02, 0.05, • • • , 0.98}. For each S, 21 equispaced values 
of p are generated in the interval [pT (S) — 0.1, pT (S) + 0.1]. 
Then M = 20 random problem instances are generated and 
solved with respect to each combination of {S, p), where 
n = \dN~\, k — [pn], and nonzero entries of sparse signals 
are generated from the standard Gaussian distribution. Success 
is declared if the relative root mean squared error (relative 
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Fig. 1. Observed phase transitions of rONE-LL and comparison with those 
of 1ST. Note that, 1) the observed phase transitions of rONE-Ll with the 
recommended r strongly agree with the theoretical calculation based on (BP) ; 
2) rONE-Ll has significantly enlarged success phases compared with 1ST. 

RMSE) 11 „|| ^ < 10~ , where x is the recovered signal, 
ii-*^ 1I2 

The number of success among M experiments is recorded. 
Finally, a generalized linear model is used to estimate the 
phase transition as in p2) . 

The experiment result is presented in Fig. [T| The observed 
phase transitions using the recommended value of r strongly 
agree with the theoretical result of (BP). It shows that the 
rONE-Ll algorithm has the optimal sparsity-undersampling 
tradeoff in the sense of £1 minimization. 

B. Comparison with 1ST 

The rONE-Ll algorithm can be considered as a modi- 
fied version of 1ST in (j2]). In this subsection we compare 
the sparsity-undersampling tradeoff and speed of these two 
algorithms. A similar method is adopted to estimate the 
phase transition of 1ST, which is implemented using the 
same parameter values as rONE-Ll. Only nine values of 5 
in {0.1,0.2, - •• ,0.9} are considered with the partial-DCT 
matrix ensemble for time consideration. Another choice of 
r = 1 + 0.2(5 is considered besides the recommended one. 
Correspondingly, the phase transition of rONE-Ll with r = 
1 + 0.2S is also estimated. 

The observed phase transitions are shown in Fig. [T] As a 
modified version of 1ST, obviously, rONE-Ll makes a great 
improvement over 1ST in the sparsity-undersampling tradeoff. 
Meanwhile, comparison of the averaged number of iterations 
of the two algorithms shows that rONE-Ll is also faster than 
1ST. For example, as 6 = 0.2 and the recommended r is used, 
rONE-Ll is about 6 times faster than 1ST. 

C. Comparison with AMP, FPC-AS and NESTA in Noise-free 
Case 

In this subsection, we report numerical simulation results 
comparing rONE-Ll with state-of-the-art algorithms, includ- 
ing AMP, FPC-AS and NESTA, in the case of sparse signals 
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TABLE I 

Comparison Results of ONE-Ll Algorithms With 
State-of-the-art Methods 



p 


Method 


# calls A & A' 


CPU time (s) 


Error (10"'') 


0.1 


rONE-Ll 
AMP 
FPC-AS 
NESTA 


1 s 1 Q ^ 1 'ioo 

i.o ly \i jZZjZUJ^- ) 

515.4 (286,954) 
222.7 (216,234) 
150.2 (135,170) 
468.9 (458,484) 


J.OZ, ^^.D/,O.JZJ 

2.14 (1.19,3.92) 
0.80 (0.76,0.86) 
0.50 (0.44,0.56) 
2.70 (2.55,2.98) 


n zL9 (i\ 1 1 n Q^l^ 

U.T-Z yJ. 1 1 ,U.V'4^ 

1.08 (0.53,1.30) 
1.02 (0.85,1.15) 
1.13 (1.07,1.23) 
1.05 (0.99,1.13) 


0.22 


eONE-Ll 

rONE-Ll 

AMP 

FPC-AS 

NESTA 


9038 (7270,11194) 

722.3 (440,972) 
1708 (1150,2252) 

589.4 (476,803) 
1084 (890,1244) 


28.5 (22.0,35.8) 
2.61 (1.63,3.93) 
6.21 (4.19,9.11) 
2.10 (1.65,2.80) 
6.47 (5.22,7.50) 


1.87 (0.46,2.66) 
1.80 (1.37,3.05) 
10.5 (6.96,15.8) 
1.96 (1.46,3.60) 
2.90 (1.62,3.98) 



* Three numbers are averaged, minimum and maximum values, respectively. 



and noise-free measurements. Our experiments used FPC-AS 
v.1.21, NESTA V. 1.1, and AMP codes provided by the author 
We choose parameter values for FPC-AS and NESTA such that 
each method produces a solution with approximately the same 
precision as that produced by rONE-Ll. In our experiments we 
consider the recovery of exactly sparse signals from partial- 
DCT measurements. We set N — 2^* and S — 0.2, and an 
'easy' case where p — 0.1 and a 'hard' case where p = 0.22 
are considered, respectively. Twenty random problems are 
created and solved for each algorithm with each combination 
of {6,p), and the minimum, maximum and averaged relative 
RMSE, number of calls of A and A', and CPU time usages 
are recorded. All experiments are carried on Matlab v.7.7.0 on 
a PC with a Windows XP system and a 3GHz CPU. Default 
parameter values are used in eONE-Ll and rONE-Ll. 

AMP: terminating if "^"lii;*"^ < 10"^ 

FPC-AS: A = 2 X 10"*^ and gtol = 1 x 10"^, where gtol 
is the termination criterion on the maximum norm of sub- 
gradient. FPC-AS solves the problem (QP^)- 

NESTA: A = 2 x 10"®, e = and the termination criterion 
tolvar = 1 X 10~^. NESTA solves (BPg) using the Nesterov 
algorithm | [TT) , with continuation. 

Our experiment results are presented in Table |l] In both 
'easy' and 'hard' cases, rONE-Ll is much faster than eONE- 
Ll. In the 'easy' case, the proposed rONE-Ll algorithm takes 
the most number of calls of A and A', except that of eONE- 
Ll, due to a conservative setting of r. But this number of 
calls (515.4) is very close to that of NESTA (468.9), and 
furthermore, the CPU time usage of rONE-Ll (2.14 s) is 
less than that of NESTA (2.70 s) because of its concise 
implementation. In the 'hard' case, rONE-Ll has the second 
best performance with significantly less CPU time than that 
of AMP and NESTA. AMP has the second worst CPU time 
and the worst accuracy as the dynamic threshold in each 
iteration depends on the mean squared error of the current 
iterative solution, which cannot be calculated exactly in the 
implementation. 



'Here 'easy' and 'hard' refer to the difficulty degree in recovering a 
sparse signal from a specific number of measurements. The setting (5, p) = 
(0.2, 0.22) is close to the phase transition of (BP). 



D. A Practical Example 

This subsection demonstrates the efficiency of rONE-Ll in 
the general CS where the signal of interest is approximately 
sparse and measurements are contaminated with noise. We 
seek to reconstruct the Mondrian image of size 256 x 256, 
shown in Fig. |2] from its noise-contaminated partial-DCT 
coefficients. This image presents a challenge as its wavelet 
expansion is compressible but not exactly sparse. The sam- 
pling pattern, which is inspired by magnetic resonance imaging 
(MRI) and is shown in Fig. |2] is adopted as most energy 
of the image concentrates at low-frequency components af- 
ter the DCT transform. The measurement vector b contains 
n — 7419 DCT measurements {S = 0.113). White Gaussian 
noise with standa rd deviation cr = 1 is then added. We set 
e = \/ n + 2\/2n(T. Haar wavelet with a decomposition level 4 
is chosen as the sparsifying transform W. Hence, the problem 
to be solved is (BP^) with A — J-pW', where is the partial- 
DCT transform. The reconstructed image is H — W'x with x 

being the reconstructed wavelet coefficients and reconstruction 

11//-//° 1 1 

error is calculated as '' , where H is the original 

image and ||-||p denotes the Frobenius norm. We compare the 
performance of rONE-Ll with NESTA and FPC-AS. 

Remark 4: AMP is omitted for its poor performance in this 
approximately-sparse-signal case. For AMP, the value of the 
dynamic threshold A^ and the term ||a;t||Q in ([Sjl depend on the 
condition that the signal to reconstruct is strictly sparse. 

In such a noisy measurement case, an exact solution for 
(BPg) is not sought after in the rONE-Ll simulation. The 
computation of the rONE-Ll algorithm is set to terminate if 
^^'^'H^ll < T = JE\^' rONE-Ll outputs the first Xt when 
it becomes a feasible solution of (BP^). 

FPC-AS: A = 1 X 10"^ gtol = 1 x 10"^ gtol_scale_x = 
1 X 10"® and the maximum number of iterations for subspace 
optimization sub_mxitr — 10. The parameters are set accord- 
ing to |[15] Section 4.4]. 

NESTA: A = 1 x lO'^, and tolvar = 1 x 10"®. The 
parameters are tuned to achieve the minimum reconstruction 
error. 

Fig. [2] shows the experiment results where rONE-Ll, FPC- 
AS and NESTA produce faithful reconstructions of the original 
image. The rONE-Ll algorithm produces a reconstruction 
error (0.0741) lower than that of FPC-AS (0.0809) with com- 
parable computation times (11.1 s and 11.4 s, respectively). 
While NESTA results in a slightly lower reconstruction error 
(0.0649), it incurs about twice more computation time (29.4 
s). 

IV. Conclusion 

In this work, we have presented novel algorithms to solve 
the basis pursuit problem for noiseless CS. The proposed 
rONE-Ll algorithm, based on the augmented Lagrange mul- 
tiplier method and heuristic simplification, can be considered 
as a modified 1ST with an aggressive continuation strategy. 
The following two cases of CS have been studied: 1) exact 
reconstruction of sparse signals from noise-free measurements, 
and 2) reconstruction of approximately sparse signals from 
noise-contaminated measurements. The proposed rONE-Ll 
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Fig. 2. An example of 2D image reconstruction from noise-contaminated 
partial-DCT measurements. Upper left: original Mondrian image; upper right: 
sampling pattern. The lower three are reconstructed images respectively by 
rONE-Ll (lower left, error: 0.0741, time: 11.1 s), FPC-AS (lower middle, 
error: 0.0809, time: 11.4 s) and NESTA (lower right, error: 0.0649, time: 
29.4 s). 



outperforms AMP, which is a well-known 1ST type algorithm, 
in Case 2 and also in Case 1 when the setting of {S, p) is 
close to the phase transition of basis pursuit. It is faster than 
NESTA in both Case 1 and Case 2. The numerical experiments 
further show that rONE-Ll outperforms FPC-AS in Case 2. 
Apart from the high computational efficiency and accurate 
reconstruction result, another useful property of rONE-Ll is 
its ease of parameter tuning. It only needs to set a termination 
criterion r if the recommended r is used and the value of t 
is explicit in Case 2. While this correspondence focuses on 
reconstruction of real-valued signals, it is straightforward to 
apply ONE-Ll algorithms to the reconstruction of complex- 
valued signals p3) . More rigorous analysis of rONE-Ll is 
currently under investigation. 
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